Supervised/Unsupervised Voice Activity Detectors for Text-dependent Speaker Recognition on the RSR2015 Corpus

Authors

Md Jahangir Alam

Patrick Kenny

Pierre Ouellet

Themos Stafylakis

Pierre Dumouchel

Abstract

Voice activity detection (VAD), i.e., discrimination between the speech and nonspeech segments of a speech signal, is an important enabling technology for a variety of speech-based applications, including speaker recognition. In this work we evaluate the following supervised and unsupervised VAD algorithms in the context of text-dependent speaker recognition on the RSR2015 (Robust Speaker Recognition 2015) task: energy-based VAD with and without a hangover scheme and endpoint detection, vector quantization-based VAD, Gaussian mixture model (GMM)-based VAD (in both supervised and unsupervised variants), and sequential GMM-based VAD. Experimental results show that both the supervised and unsupervised GMM-based VADs perform better than the other VAD algorithms. Considering all three evaluation metrics (equal error rate and the old (SRE 2008) and new (SRE 2010) normalized detection cost functions), the unsupervised GMM-based VAD performed best.
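To make the simplest of the evaluated detectors concrete, the following is a minimal sketch of an energy-based VAD with a hangover scheme. It is not the authors' implementation: the frame length, hop size, 30 dB margin below the loudest frame, 5-frame hangover, and the helper `make_test_signal` are all illustrative assumptions, since the abstract does not give these details.

```python
import math


def frame_log_energies(samples, frame_len=200, hop=80):
    """Return the log energy in dB of each frame of a list of float samples."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # Small floor avoids log10(0) on digitally silent frames.
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies


def energy_vad(samples, frame_len=200, hop=80, margin_db=30.0, hangover=5):
    """Label each frame True (speech) or False (nonspeech).

    Assumed decision rule: a frame is speech when its log energy lies within
    `margin_db` of the loudest frame. The hangover keeps the decision at
    "speech" for up to `hangover` frames after the energy drops, so that
    weak word endings are not clipped.
    """
    energies = frame_log_energies(samples, frame_len, hop)
    if not energies:
        return []
    threshold = max(energies) - margin_db
    labels = []
    remaining = 0  # hangover frames still available
    for energy in energies:
        if energy > threshold:
            labels.append(True)
            remaining = hangover
        elif remaining > 0:
            labels.append(True)
            remaining -= 1
        else:
            labels.append(False)
    return labels


def make_test_signal():
    """Hypothetical test input: silence, a 200 Hz tone at 8 kHz, then silence."""
    tone = [0.5 * math.sin(2.0 * math.pi * 200.0 * i / 8000.0) for i in range(1600)]
    return [0.0] * 800 + tone + [0.0] * 1600


if __name__ == "__main__":
    labels = energy_vad(make_test_signal())
    print("".join("1" if is_speech else "0" for is_speech in labels))
```

Setting `hangover=0` gives the variant without a hangover scheme, which makes the effect of the hangover easy to compare on the same signal. A fixed margin below the peak energy is only one possible thresholding choice; practical systems often estimate the noise floor instead.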

Related resources

Extended RSR2015 for text-dependent speaker verification over VHF channel

Text-dependent speaker verification over degraded radio channel is a challenging task. To better understand the research problem, the Institute for Infocomm Research (I2R) of Singapore has collected a corpus of voice recordings transmitted over marine VHF. Built as an extension of the RSR2015 database, the VHF-RSR2015 consists of recordings from 300 speakers of Part I of the RSR2015 database tr...

The RSR2015: Database for Text-Dependent Speaker Verification using Multiple Pass-Phrases

This paper describes a new speech corpus, the RSR2015 database designed for text-dependent speaker recognition with scenario based on fixed pass-phrases. This database consists of over 71 hours of speech recorded from English speakers covering the diversity of accents spoken in Singapore. Acquisition has been done using a set of six portable devices including smart phones and tablets. The pool ...

RSR2015: Database for Text-Dependent Speaker Verification using Multiple Pass-Phrases

Vulnerability evaluation of speaker verification under voice conversion spoofing: the effect of text constraints

Voice conversion, a technique to change one’s voice to sound like that of another, poses a threat to even high performance speaker verification system. Vulnerability of text-independent speaker verification systems under spoofing attack, using statistical voice conversion technique, was evaluated and confirmed in our previous work. In this paper, we further extend the study to text-dependent sp...

Singing speaker clustering based on subspace learning in the GMM mean supervector space

In this study, we propose algorithms based on subspace learning in the GMM mean supervector space to improve performance of speaker clustering with speech from both reading and singing. As a speaking style, singing introduces changes in the time-frequency structure of a speaker’s voice. The purpose of this study is to introduce advancements for speech systems such as speech indexing and retriev...

Publication date: 2014
